Asymptotic normality of integer compositions inside a rectangle 

Steffen Eger 
Carnegie Mellon University, School of Computer Science 

seger@cs.cmu.edu






Abstract 

Among all restricted integer compositions with at most $m$ parts, each of which has size at most $l$, choose one uniformly at random. Which integer does this composition represent? In the current note, we show that the underlying distribution is, for large $m$ and $l$, approximately normal with mean value $\frac{lm}{2}$.

1 Introduction 

An integer composition of a nonnegative integer $n$ is, informally, a way of writing $n$ as a sum of nonnegative integers $\pi_1, \ldots, \pi_k$, for some $k \ge 0$. Let $h_{l,m}(n)$ denote the number of integer compositions of the nonnegative integer $n$ with at most $m$ parts, each of which has size at most $l$ ('compositions inside a rectangle'). Recently, Sagan (2009) has shown that the sequence

\[ h_{l,m}(0), h_{l,m}(1), \ldots, h_{l,m}(lm) \]

is unimodal. In Figure 1 we plot this sequence for $l = 2$, $m = 5$; $l = 6$, $m = 5$; and $l = 6$, $m = 20$. Apparently, as $l$ and $m$ increase, $h_{l,m}$ looks more and more 'Gaussian'. This suggests a probabilistic interpretation of $h_{l,m}(n)$, according to which the normalized values

\[ \frac{h_{l,m}(n)}{\sum_{i=0}^{lm} h_{l,m}(i)}, \quad n = 0, \ldots, lm, \]

denote the probabilities that a uniform randomly chosen integer composition with at most $m$ parts, each of which has size at most $l$, represents the integer $n$. In the current note, we show that these probabilities follow, for large $l$ and $m$, approximately a normal distribution with mean value $\frac{ml}{2}$ and variance $m\frac{(l+1)^2 - 1}{12}$.

Figure 1: The sequences $h_{l,m}(0), \ldots, h_{l,m}(lm)$ for $l = 2$, $m = 5$ (left), $l = 6$, $m = 5$ (middle) and $l = 6$, $m = 20$ (right).

To this end, we first define multinomial triangles as a generalization of Pascal's triangle and characterize their entries, polynomial coefficients, as generalizations of the well-studied binomial coefficients (Section 2), whereupon we outline a recently found relationship between polynomial coefficients and specifically restricted integer compositions (Section 3). The latter, with various types of restrictions, have attracted much attention in recent years (cf. [2], [1], [S], [10], [13], [TS], [H]). For example, Malandro [13] determines asymptotic formulas for $L$-restricted integer compositions, $L$ being an arbitrary finite set, and Shapcott [16] and Schmutz and Shapcott [H] find a lognormal distribution for part products of restricted integer compositions. Hitczenko and Stengle [11] derive the expected number of distinct part sizes of unrestricted random compositions. Restricted and unrestricted integer compositions have a variety of applications, ranging from the theory of patterns [5] to monotone paths in two-dimensional lattices ([H]), alignments between strings ([7]), and the distribution of the sum of discrete integer-valued random variables ([5]).

Then, in Section 4, we state our main theorem, asymptotic normality of compositions inside a rectangle, which we prove in Section 5. In the conclusion, we discuss generalizations of the analyzed setting where part sizes are restricted to lie within arbitrary finite sets.

While our main result might, viewed in the right light, be considered not very surprising, the steps that lead to it (Lemmas 5.1 to 5.5) may be judged interesting in their own right (and are certainly novel) because they specify the exact distribution of the random variable $X_{l,m}$ that sums the parts of a randomly chosen integer composition from a rectangle of size $l \times m$, and give an elegant characterization of it in terms of the distribution of the sum of independent uniform random variables and an "error term" that tends toward zero quadratically.

2 Multinomial triangles and polynomial coefficients 

Generalizing binomial triangles, $(l+1)$-nomial triangles, $l \ge 0$, are defined in the following way. Starting with a 1 in row zero, construct an entry in row $k$, $k \ge 1$, by adding the overlying $(l+1)$ entries in row $(k-1)$ (some of these entries are taken as zero if not defined); thereby, row $k$ has $(kl+1)$ entries. For example, the monomial ($l = 0$), binomial ($l = 1$), trinomial ($l = 2$) and quadrinomial ($l = 3$) triangles start as follows (rows 0 to 3).

  l = 0:  1;  1;  1;  1
  l = 1:  1;  1 1;  1 2 1;  1 3 3 1
  l = 2:  1;  1 1 1;  1 2 3 2 1;  1 3 6 7 6 3 1
  l = 3:  1;  1 1 1 1;  1 2 3 4 3 2 1;  1 3 6 10 12 12 10 6 3 1

In the $(l+1)$-nomial triangle, entry $n$, $0 \le n \le kl$, in row $k$, which we denote by $\binom{k}{n}_{l+1}$ and refer to as polynomial coefficient (cf. Caiado (2007) [1], Comtet (1974) [3]), has the following interpretation. It is the coefficient of $x^n$ in the expansion of

\[ (1 + x + x^2 + \ldots + x^l)^k = \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^n. \]

Also note that, by its definition, $\binom{k}{n}_{l+1}$ satisfies the following recursion:

\[ \binom{k}{n}_{l+1} = \sum_{j=0}^{l} \binom{k-1}{n-j}_{l+1}. \]
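The row-by-row construction can be sketched in a few lines of Python. The function name `polynomial_row` is an illustrative choice and not notation from this note; the sketch simply multiplies by $(1 + x + \ldots + x^l)$ once per row, which is the same as adding the $(l+1)$ overlying entries.

```python
def polynomial_row(k, l):
    """Row k of the (l+1)-nomial triangle, i.e. the coefficients of
    (1 + x + ... + x^l)^k, listed for n = 0, ..., k*l."""
    row = [1]  # row zero consists of a single 1
    for _ in range(k):
        new_row = [0] * (len(row) + l)  # each new row has l more entries
        for pos, value in enumerate(row):
            # every entry contributes to the (l+1) entries below it
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
    return row


if __name__ == "__main__":
    for l in range(4):
        print(l, [polynomial_row(k, l) for k in range(4)])
```

For $l = 1$ this reduces to Pascal's triangle, and the entries of row $k$ always sum to $(l+1)^k$.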



3 Integer compositions and polynomial coefficients 

An integer composition of a nonnegative integer $n$ is a tuple $\pi = (\pi_1, \ldots, \pi_k)$, $k \ge 0$, of nonnegative integers such that

\[ n = \pi_1 + \ldots + \pi_k, \]

where the $\pi_i$'s are called parts, and $k$ is the number of parts.¹ Let $C(n,k,a,b)$ denote the set of restricted compositions of $n$ into $k$ parts $\pi_i$ with $a \le \pi_i \le b$, where $a, b \in \mathbb{N} \cup \{\infty\}$ such that $0 \le a \le b$, and let $c(n,k,a,b)$ denote its size, $c(n,k,a,b) = |C(n,k,a,b)|$. For example, for $n = 5$, $k = 2$, $a = 0$, $b = \infty$, we have

\[ 5 = 5 + 0 = 0 + 5 = 4 + 1 = 1 + 4 = 3 + 2 = 2 + 3, \]

and thus $c(5, 2, 0, \infty) = 6$.

¹ Compositions where some parts are allowed to be zero are sometimes called weak compositions.

The following results are well-known.

\[ c(n, k, 0, \infty) = \binom{n+k-1}{k-1} \tag{3.1} \]
\[ c(n, k, 1, \infty) = \binom{n-1}{k-1} \tag{3.2} \]
\[ c(n, k, a, \infty) = c(n - ka, k, 0, \infty) = \binom{n - ka + k - 1}{k-1} \tag{3.3} \]



Moreover, in recent work, Eger (2012) [3] has shown, more generally, a simple relationship between the number of restricted integer compositions and polynomial coefficients, namely,

\[ c(n, k, a, b) = \binom{k}{n - ka}_{b-a+1}. \tag{3.4} \]
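As a sanity check, relation (3.4), read as $c(n,k,a,b) = \binom{k}{n-ka}_{b-a+1}$, can be compared against brute-force enumeration for small parameters. The following Python sketch does this; the helper names are hypothetical, and the polynomial coefficient is taken as zero outside the range $0, \ldots, k(b-a)$.

```python
from itertools import product


def polynomial_coefficient(k, n, base):
    """Coefficient of x^n in (1 + x + ... + x^(base-1))^k (zero if out of range)."""
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + base - 1)
        for pos, value in enumerate(row):
            for shift in range(base):
                new_row[pos + shift] += value
        row = new_row
    return row[n] if 0 <= n < len(row) else 0


def count_compositions(n, k, a, b):
    """Brute-force c(n, k, a, b): k-tuples with entries in a..b that sum to n."""
    return sum(1 for parts in product(range(a, b + 1), repeat=k)
               if sum(parts) == n)


def identity_holds(k, a, b):
    """Compare both sides of (3.4) for all n from 0 to k*b + 1."""
    return all(
        count_compositions(n, k, a, b)
        == polynomial_coefficient(k, n - k * a, b - a + 1)
        for n in range(k * b + 2)
    )


if __name__ == "__main__":
    print(count_compositions(5, 2, 0, 5), polynomial_coefficient(2, 5, 6))
    print(identity_holds(3, 1, 4))
```

The enumeration is exponential in $k$ and is only meant for small test cases; the polynomial coefficient itself is computed in polynomial time.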



4 Main theorem 



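Let $m$ be a positive integer and let $l$ be a nonnegative integer. Denote by $h_{l,m}(n)$ the number of integer compositions of the integer $n$ with at most $m$ parts $p$, each of which has size at most $l$, i.e. $0 \le p \le l$. Let $X_{l,m}$ be the random variable that takes on the integer $n$, for $0 \le n \le lm$, with probability

\[ \frac{h_{l,m}(n)}{\sum_{i=0}^{lm} h_{l,m}(i)}. \]

Theorem 4.1. Let $\mu_{l,m} = \frac{ml}{2}$ and let $\sigma^2_{l,m} = \frac{(l+1)^2 - 1}{12}$. Then

\[ \frac{X_{l,m} - \mu_{l,m}}{\sigma_{l,m}\sqrt{m}} \longrightarrow N(0,1) \quad \text{as } l, m \to \infty. \]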



Our strategy for proving Theorem 4.1 is as follows. First, we determine the exact distribution of $X_{l,m}$ in Lemma 5.1. Then we derive the exact distribution of the sum of $m$ independently and uniformly distributed random variables in Lemma 5.2, which is, by the Central Limit Theorem, asymptotically a normal distribution. Next, Lemmas 5.3 and 5.4 provide inequalities and upper bounds that we require in Lemma 5.5, where we show that the distribution of $X_{l,m}$ can be represented, roughly, as the sum of two parts: the distribution of the sum $S_1 + \ldots + S_m$ of $m$ independently distributed uniform random variables (derived in Lemma 5.2) and an "error term" that converges quadratically toward zero in $l$.
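The statement of the theorem can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $h_{l,m}(n) = \sum_{j=1}^{m} c(n,j,0,l)$, standardizes $X_{l,m}$ with mean $\frac{ml}{2}$ and variance $m\frac{(l+1)^2-1}{12}$, and reports the largest gap between the resulting distribution function and the standard normal one. The function names and the continuity correction of $0.5$ are choices made here, and $l \ge 1$ is assumed so that the variance is positive.

```python
from fractions import Fraction
from math import erf, sqrt


def composition_counts(l, m):
    """h_{l,m}(n) for n = 0..l*m, summing rows 1..m of the (l+1)-nomial triangle."""
    counts = [0] * (l * m + 1)
    row = [1]
    for _ in range(m):
        new_row = [0] * (len(row) + l)
        for pos, value in enumerate(row):
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
        for n, value in enumerate(row):
            counts[n] += value
    return counts


def distribution_X(l, m):
    """Exact probabilities P[X_{l,m} = n] as Fractions."""
    counts = composition_counts(l, m)
    total = sum(counts)
    return [Fraction(c, total) for c in counts]


def max_cdf_gap(l, m):
    """Largest gap between the CDF of standardized X_{l,m} and the normal CDF."""
    mu = l * m / 2
    sigma = sqrt(m * ((l + 1) ** 2 - 1) / 12)  # requires l >= 1
    gap, cdf = 0.0, 0.0
    for n, p in enumerate(distribution_X(l, m)):
        cdf += float(p)
        z = (n + 0.5 - mu) / sigma  # continuity correction (a choice made here)
        gap = max(gap, abs(cdf - 0.5 * (1 + erf(z / sqrt(2)))))
    return gap


if __name__ == "__main__":
    for l, m in [(2, 5), (6, 5), (6, 20), (32, 20)]:
        print(l, m, max_cdf_gap(l, m))
```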



5 Proof of the main theorem 

Lemma 5.1. Let $i$, $1 \le i \le m$, be the smallest index such that $n \le il$. Then,

\[ P[X_{l,m} = n] = \frac{l}{(l+1)\left((l+1)^m - 1\right)} \sum_{j=i}^{m} \binom{j}{n}_{l+1}. \]



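Proof. By definition, $h_{l,m}(n) = \sum_{j=1}^{m} c(n,j,0,l) = \sum_{j=i}^{m} \binom{j}{n}_{l+1}$, where the last equality follows from (3.4) together with the fact that $c(n,j,0,l)$ is zero when $j < i$, since then $n > (i-1)l \ge jl$. Finally, the number of compositions with exactly $j$ parts, each between $0$ and $l$, is $(l+1)^j$. Therefore, for $l \ge 1$,

\[ \sum_{n'=0}^{lm} h_{l,m}(n') = \sum_{n'=0}^{lm} \sum_{j=1}^{m} c(n',j,0,l) = \sum_{j=1}^{m} \sum_{n'=0}^{lm} c(n',j,0,l) = \sum_{j=1}^{m} (l+1)^j = \frac{(l+1)\left((l+1)^m - 1\right)}{l}. \]

Hence,

\[ P[X_{l,m} = n] = \frac{h_{l,m}(n)}{\sum_{n'=0}^{lm} h_{l,m}(n')} = \frac{l}{(l+1)\left((l+1)^m - 1\right)} \sum_{j=i}^{m} \binom{j}{n}_{l+1}. \]

□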
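The closed form of Lemma 5.1 can be checked against direct enumeration for small rectangles. The sketch below assumes the reading $P[X_{l,m} = n] = \frac{l}{(l+1)((l+1)^m - 1)} \sum_{j=i}^{m} \binom{j}{n}_{l+1}$ with $l \ge 1$; the function names are hypothetical, and exact rational arithmetic is used so that equality can be tested without rounding.

```python
from fractions import Fraction
from itertools import product


def polynomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k for n = 0..k*l."""
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + l)
        for pos, value in enumerate(row):
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
    return row


def prob_closed_form(n, l, m):
    """Closed form of Lemma 5.1 as read above (requires l >= 1)."""
    i = max(1, -(-n // l))  # smallest index i >= 1 with n <= i*l
    total = sum(polynomial_row(j, l)[n] for j in range(i, m + 1))
    return Fraction(l * total, (l + 1) * ((l + 1) ** m - 1))


def prob_brute_force(n, l, m):
    """Enumerate all compositions with 1..m parts, each part in 0..l."""
    hits, everything = 0, 0
    for j in range(1, m + 1):
        for parts in product(range(l + 1), repeat=j):
            everything += 1
            hits += (sum(parts) == n)
    return Fraction(hits, everything)


if __name__ == "__main__":
    l, m = 2, 3
    for n in range(l * m + 1):
        print(n, prob_closed_form(n, l, m), prob_brute_force(n, l, m))
```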



Lemma 5.2. Denote by $S_l^{(m)}$ the sum $S_1 + \ldots + S_m$ of independent uniform random variables $S_j$, $j = 1, \ldots, m$, each taking values from the set $\{0, \ldots, l\}$. The distribution of $S_l^{(m)}$ is given by

\[ P[S_l^{(m)} = n] = \frac{1}{(l+1)^m} \binom{m}{n}_{l+1}. \]

Proof. See Caiado (2007) and Eger (2012). □



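Remark 5.1. Note that the expected value and the variance of $S_l^{(m)}$ in Lemma 5.2 are given by

\[ E[S_l^{(m)}] = m E[S_1] = \frac{ml}{2}, \qquad \mathrm{Var}[S_l^{(m)}] = m \mathrm{Var}[S_1] = m \frac{(l+1)^2 - 1}{12}. \]

Also note that, by the Central Limit Theorem, the distribution of $S_l^{(m)}$ is asymptotically normal.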

Now, we prove a fact well-known for binomial coefficients, namely, that the 'central' coefficient majorizes 
the remaining coefficients in a given row in the (multinomial) triangle. 

Lemma 5.3. Let $k \ge 0$ and $l \ge 0$ be integers. For all integers $n$ such that $0 \le n \le kl$,

\[ \binom{k}{n}_{l+1} \le \binom{k}{\lfloor \frac{kl}{2} \rfloor}_{l+1}. \]

Proof. By the representation of $\binom{k}{n}_{l+1}$ as $\binom{k}{n}_{l+1} = \sum_{j=0}^{l} \binom{k-1}{n-j}_{l+1}$, we find for $n \ge 1$,

\[ \binom{k}{n}_{l+1} - \binom{k}{n-1}_{l+1} = \binom{k-1}{n}_{l+1} - \binom{k-1}{n-l-1}_{l+1}. \tag{5.1} \]

Moreover, it is easy to show that polynomial coefficients are symmetric in the following sense:

\[ \binom{k}{kl - n}_{l+1} = \binom{k}{n}_{l+1}. \]

Therefore it suffices to show that the sequence $\binom{k}{0}_{l+1}, \binom{k}{1}_{l+1}, \ldots, \binom{k}{\lfloor kl/2 \rfloor}_{l+1}$ is non-decreasing. But by (5.1) this easily follows inductively, using the row number $k$ as induction variable. Importantly, note that, in (5.1), if $n \le \lfloor \frac{kl}{2} \rfloor$, then $\binom{k-1}{n}_{l+1}$ is defined and greater than zero for all $k \ge 2$ since then $n \le \lfloor \frac{kl}{2} \rfloor \le (k-1)l$. □
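Both ingredients of this argument are easy to test numerically for small rows. The sketch below (helper names are hypothetical) checks that the entry at position $\lfloor kl/2 \rfloor$ is a row maximum, and that the difference identity $\binom{k}{n}_{l+1} - \binom{k}{n-1}_{l+1} = \binom{k-1}{n}_{l+1} - \binom{k-1}{n-l-1}_{l+1}$ holds when undefined entries are taken as zero.

```python
def polynomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k for n = 0..k*l."""
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + l)
        for pos, value in enumerate(row):
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
    return row


def entry(row, n):
    """Entry n of a row, taken as zero where the triangle has no entry."""
    return row[n] if 0 <= n < len(row) else 0


def central_is_maximal(k, l):
    """True if the entry at floor(k*l/2) is a maximum of row k."""
    row = polynomial_row(k, l)
    return max(row) == row[(k * l) // 2]


def difference_identity_holds(k, l):
    """Check the difference identity for row k >= 1 and all 1 <= n <= k*l."""
    cur, prev = polynomial_row(k, l), polynomial_row(k - 1, l)
    return all(
        cur[n] - cur[n - 1] == entry(prev, n) - entry(prev, n - l - 1)
        for n in range(1, k * l + 1)
    )


if __name__ == "__main__":
    print(all(central_is_maximal(k, l) for k in range(7) for l in range(5)))
    print(all(difference_identity_holds(k, l)
              for k in range(1, 7) for l in range(5)))
```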

In the following lemma, we write $a_k \sim b_k$ as a short-hand for $\lim_{k \to \infty} \frac{a_k}{b_k} = 1$. Also note that the following lemma is a generalization of Stirling's approximation to the central binomial coefficient.



Lemma 5.4. For all fixed $l$,

\[ \binom{k}{\lfloor \frac{kl}{2} \rfloor}_{l+1} \sim \frac{(l+1)^k}{\sqrt{2\pi k \frac{(l+1)^2 - 1}{12}}}. \]

Proof. See Eger [6]. □
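To get a feeling for how quickly this approximation becomes accurate, one can compute the ratio of the exact central coefficient to the approximating expression $\frac{(l+1)^k}{\sqrt{2\pi k ((l+1)^2 - 1)/12}}$. The Python sketch below does so for fixed $l \ge 1$; `central_ratio` is a name introduced here for illustration only.

```python
from math import pi, sqrt


def polynomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k for n = 0..k*l."""
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + l)
        for pos, value in enumerate(row):
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
    return row


def central_ratio(k, l):
    """Exact central coefficient divided by its Stirling-type approximation."""
    exact = polynomial_row(k, l)[(k * l) // 2]
    approx = (l + 1) ** k / sqrt(2 * pi * k * ((l + 1) ** 2 - 1) / 12)
    return exact / approx


if __name__ == "__main__":
    for k in (5, 20, 80, 200):
        print(k, central_ratio(k, 1), central_ratio(k, 2))
```

For $l = 1$ the expression reduces to the familiar approximation $2^k / \sqrt{\pi k / 2}$ of the central binomial coefficient.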

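Lemma 5.5. For all $l$ and $m$ and for all $n$ such that $0 \le n \le ml$,

\[ P[X_{l,m} = n] = \gamma_{l,m} P[S_l^{(m)} = n] + \epsilon_{l,m}, \]

where $\epsilon_{l,m}$ is an "error term" that satisfies

\[ 0 \le \epsilon_{l,m} \le O(l^{-2}) \]

and $\gamma_{l,m}$ satisfies

\[ \gamma_{l,m} = \left(1 + O(l^{-1})\right)^{-1}. \]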

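Proof. Let $i$, $1 \le i \le m$, be the smallest index such that $n \le il$. Moreover, define $\alpha_{l,m}$ as $\alpha_{l,m} = \frac{l}{((l+1)^m - 1)(l+1)}$ and note that $\alpha_{l,m} = \gamma_{l,m} \frac{1}{(l+1)^m}$, where $\gamma_{l,m} = (1 + 1/l)^{-1}$ (ignoring the $(-1)$ in the denominator of $\alpha_{l,m}$). Then

\[ P[X_{l,m} = n] = \alpha_{l,m} \sum_{j=i}^{m} \binom{j}{n}_{l+1} = \alpha_{l,m} \binom{m}{n}_{l+1} + \alpha_{l,m} \sum_{j=i}^{m-1} \binom{j}{n}_{l+1} = \gamma_{l,m} P[S_l^{(m)} = n] + \epsilon_{l,m}, \]

where we define $\epsilon_{l,m} = \alpha_{l,m} \sum_{j=i}^{m-1} \binom{j}{n}_{l+1}$. Obviously, $\epsilon_{l,m} \ge 0$. Moreover, by Lemmas 5.3 and 5.4,

\[ \epsilon_{l,m} \le \alpha_{l,m} \sum_{j=i}^{m-1} \binom{j}{\lfloor \frac{jl}{2} \rfloor}_{l+1} \le \alpha_{l,m} O(1) \sum_{j=i}^{m-1} \frac{(l+1)^j}{\sqrt{2\pi j \frac{(l+1)^2 - 1}{12}}}. \tag{5.2} \]

Now, $\sqrt{(l+1)^2 - 1} \ge \frac{\sqrt{3}}{2}(l+1)$ for $l \ge 1$, so that

\[ \sum_{j=i}^{m-1} \frac{(l+1)^j}{\sqrt{2\pi j \frac{(l+1)^2 - 1}{12}}} = \sum_{j=i}^{m-1} O(1) \frac{(l+1)^j}{\sqrt{j}\,(l+1)} \le O(1) \sum_{j=i}^{m-1} (l+1)^{j-1}, \]

whence, continuing from (5.2),

\[ \epsilon_{l,m} \le \alpha_{l,m} O(1) \sum_{j=i}^{m-1} (l+1)^{j-1} \le O(1)\left((l+1)^{-2} - (l+1)^{i-m-2}\right) \le O(1)(l+1)^{-2}. \tag{5.3} \]

□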
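The decomposition in Lemma 5.5 can be computed exactly for small parameters. In the sketch below, which is an illustration under stated assumptions, `gamma` is the exact factor $\alpha_{l,m}(l+1)^m$ rather than the approximation $(1 + 1/l)^{-1}$, with $\alpha_{l,m} = \frac{l}{((l+1)^m - 1)(l+1)}$ and $l \ge 1$; the function names are hypothetical. The helper `max_scaled_error` multiplies the largest error term by $(l+1)^2$, so that a roughly constant value across $l$ is consistent with quadratic decay.

```python
from fractions import Fraction


def polynomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k for n = 0..k*l."""
    row = [1]
    for _ in range(k):
        new_row = [0] * (len(row) + l)
        for pos, value in enumerate(row):
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
    return row


def decomposition(n, l, m):
    """Return (P[X = n], gamma, P[S = n], eps) with P[X = n] = gamma * P[S = n] + eps."""
    def entry(row):
        return row[n] if n < len(row) else 0

    alpha = Fraction(l, (l + 1) * ((l + 1) ** m - 1))  # requires l >= 1
    rows = [polynomial_row(j, l) for j in range(1, m + 1)]
    p_x = alpha * sum(entry(r) for r in rows)
    p_s = Fraction(entry(rows[-1]), (l + 1) ** m)
    gamma = alpha * (l + 1) ** m  # exact factor, close to (1 + 1/l)^(-1)
    eps = alpha * sum(entry(r) for r in rows[:-1])
    return p_x, gamma, p_s, eps


def max_scaled_error(l, m):
    """Largest error term over n, multiplied by (l + 1)^2."""
    worst = max(decomposition(n, l, m)[3] for n in range(l * m + 1))
    return float(worst * (l + 1) ** 2)


if __name__ == "__main__":
    for l in (1, 2, 4, 8, 16):
        print(l, max_scaled_error(l, 6))
```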



In Table 1 we show the decrease of $\epsilon_{l,m}$ in Lemma 5.5 as $l$ increases. Our bound appears to be quite tight, as $\epsilon_{l,m}$ in fact seems to decay approximately quadratically in $l$. In Figure 2 the distributions of $X_{l,m}$ and $S_l^{(m)}$ for different values of $l$ and $m$ are plotted. The variable $X_{l,m}$ has a particular distributional shape that can be inferred from the proof of Lemma 5.5. For small values of $n$, the distribution of $X_{l,m}$ tends to be larger than that of $S_l^{(m)}$ (there, $\epsilon_{l,m}$ is relatively larger, as can be seen from Equation (5.3)), while this relation is reversed for large $n$.







            m = 10                        m = 20
  l         max. difference     factor    max. difference     factor
  1         0.0471                        0.0240
  2         ...                 ...       ...                 ...
  8         ...                 ...       9.5016 x 10^-4      ...
  16        5.5909 x 10^-4      3.56      2.6494 x 10^-4      3.58
  32        1.4871 x 10^-4      3.75      7.0291 x 10^-5      3.76
  64        3.8399 x 10^-5      3.82      1.8126 x 10^-5      3.87

Table 1: Maximum over absolute differences $|P[X_{l,m} = n] - P[S_l^{(m)} = n]|$, $n = 0, \ldots, lm$, for $m = 10$ and $m = 20$ and varying $l$. We also specify the factor of decrease in these differences between successive $l$ values.






Figure 2: The distributions of $X_{l,m}$ and $S_l^{(m)}$ for $m = 10$ and $l = 2$ (left), $l = 4$ (middle), and $l$ = ... (right).



6 Conclusion 



The choice of the restrictions $0 \le p \le l$ for parts $p$ of integer compositions has, although illustrating a model case, largely been arbitrary. In fact, results similar to Theorem 4.1 would hold for any finite set $L = \{a_1, \ldots, a_k\}$ as range for part sizes. For $L = \{a, a+1, \ldots, b\}$, $0 \le a \le b$, we find simple closed form solutions of the asymptotic distribution of $X_{L,m}$, where we define $X_{L,m}$ (and other variables such as $S_L^{(m)}$) as a generalization of $X_{l,m}$ above with $X_{l,m} = X_{\{0,1,\ldots,l\},m}$. For example, in this case, $S_L^{(m)}$ has exact distribution

\[ P[S_L^{(m)} = n] = \frac{1}{(b-a+1)^m} \binom{m}{n - ma}_{b-a+1} \]

(cf. Eger (2012) [5]) with expected value $\frac{m(a+b)}{2}$ and is, by the Central Limit Theorem, asymptotically normally distributed. Conversely, the distribution of $X_{L,m}$ allows a similar representation as in Lemma 5.1 as a sum of quantities $\binom{j}{n - ja}_{b-a+1}$ and a normalizing term, from which we can straightforwardly derive a decomposition of $X_{L,m}$ as in Lemma 5.5 with bounds obtained from Lemmas 5.3 and 5.4.



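As a final remark, note that our results entail a 'Stirling'-like formula for $h_{l,m}(n)$. By definition,

\[ P[X_{l,m} = n] = \frac{h_{l,m}(n)}{\sum_{i=0}^{lm} h_{l,m}(i)}, \]

and equating this quantity at its asymptotic mean value $\frac{ml}{2}$ with the corresponding normal density leads to

\[ h_{l,m}\left(\frac{ml}{2}\right) \approx \frac{\left((l+1)^m - 1\right)(l+1)}{l} \cdot \frac{1}{\sqrt{2\pi m \frac{(l+1)^2 - 1}{12}}}. \]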
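A quick numerical comparison shows how such an approximation behaves for moderate parameters. The sketch below assumes the formula $h_{l,m}(\frac{ml}{2}) \approx \frac{((l+1)^m - 1)(l+1)}{l} \cdot \left(2\pi m \frac{(l+1)^2 - 1}{12}\right)^{-1/2}$ with $l \ge 1$, evaluates the exact count at $\lfloor \frac{ml}{2} \rfloor$, and returns the ratio of the two; the function names are illustrative.

```python
from math import pi, sqrt


def composition_counts(l, m):
    """h_{l,m}(n) for n = 0..l*m, summing rows 1..m of the (l+1)-nomial triangle."""
    counts = [0] * (l * m + 1)
    row = [1]
    for _ in range(m):
        new_row = [0] * (len(row) + l)
        for pos, value in enumerate(row):
            for shift in range(l + 1):
                new_row[pos + shift] += value
        row = new_row
        for n, value in enumerate(row):
            counts[n] += value
    return counts


def stirling_like(l, m):
    """Normal-density approximation of h_{l,m} at the mean (requires l >= 1)."""
    total = (l + 1) * ((l + 1) ** m - 1) / l  # number of admissible compositions
    return total / sqrt(2 * pi * m * ((l + 1) ** 2 - 1) / 12)


def ratio_at_mean(l, m):
    """Exact count at floor(m*l/2) divided by the approximation."""
    return composition_counts(l, m)[(m * l) // 2] / stirling_like(l, m)


if __name__ == "__main__":
    for l, m in [(2, 10), (8, 20), (16, 30)]:
        print(l, m, ratio_at_mean(l, m))
```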